🕷️ Job Radar • SCRAPING

Live freelance tracking. Raw descriptions turned into structured data. Find your next tech project without the noise.

upwork.com 🟡 2026-06-01

🔹 Public Web Data Intake System Implementation
👤 Client: 🇺🇸 United States • Member since 2026-03-25
💰 Price: ****
🚩 Problem: Need for maintainable, production-ready scrapers to populate an existing research pipeline from diverse public web sources.
📦 Existing: Core repository, database schema, stub source adapters, and ingestion flow (Python + Supabase).

Specifications:

[Target]: Public directories, vendor pages, documentation, blogs, news, public databases, APIs, RSS, paginated search pages
[Method]: Incremental ingestion, deduplication via content hashing, pagination handling, rate limiting, retries
[Stack]: Python, Supabase/Postgres, requests, httpx, BeautifulSoup, trafilatura, scrapy, playwright
[Format]: Structured records containing source URL, title, raw text, metadata, timestamps, and content hashes
[Security]: Lawful collection of public data; no authentication bypass, paywalls, or CAPTCHA solving
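
For orientation, the [Format] and [Method] lines could translate into something like the sketch below. It assumes SHA-256 over whitespace-normalized text; the `Record` class, its field names, and the `content_hash` helper are illustrative and not taken from the client's repository or schema.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


def content_hash(raw_text: str) -> str:
    """Return a SHA-256 hex digest of whitespace-normalized text (assumed scheme)."""
    normalized = " ".join(raw_text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


@dataclass
class Record:
    """Hypothetical record shape mirroring the fields listed under [Format]."""
    source_url: str
    title: str
    raw_text: str
    metadata: dict = field(default_factory=dict)
    fetched_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    content_sha256: str = ""

    def __post_init__(self) -> None:
        # Derive the hash from the text unless the caller supplied one.
        if not self.content_sha256:
            self.content_sha256 = content_hash(self.raw_text)
```

Normalizing whitespace before hashing is one possible choice: it keeps trivial layout changes on a page from producing a new hash, at the cost of ignoring whitespace-only edits.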

Workflow:

1. Review existing repository and source-adapter interface
2. Implement production-quality source collector
3. Integrate data storage into Supabase
4. Validate deduplication on repeat execution
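
Step 4 could be checked along these lines: run the same batch twice and confirm the second run inserts nothing. This is only a sketch; `InMemoryStore` is a hypothetical stand-in for a Supabase/Postgres table with a unique content-hash column, and `ingest` and `check_idempotent` are illustrative names, not part of the existing ingestion flow.

```python
import hashlib


class InMemoryStore:
    """Hypothetical stand-in for a table with a unique content-hash column."""

    def __init__(self) -> None:
        self.rows: dict[str, dict] = {}

    def insert_if_new(self, row: dict) -> bool:
        # Key each row by the SHA-256 of its raw text; skip rows already seen.
        key = hashlib.sha256(row["raw_text"].encode("utf-8")).hexdigest()
        if key in self.rows:
            return False
        self.rows[key] = row
        return True


def ingest(store: InMemoryStore, rows: list[dict]) -> int:
    """Insert rows and return how many were actually new."""
    return sum(store.insert_if_new(row) for row in rows)


def check_idempotent(rows: list[dict]) -> tuple[int, int]:
    """Ingest the same batch twice; the second count should be zero."""
    store = InMemoryStore()
    first = ingest(store, rows)
    second = ingest(store, rows)
    return first, second


sample = [
    {"source_url": "https://example.com/a", "raw_text": "first page"},
    {"source_url": "https://example.com/b", "raw_text": "second page"},
]
first_run, second_run = check_idempotent(sample)
```

Against a real database, the same check would compare row counts before and after a repeat run rather than relying on an in-memory dictionary.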
